Search Results

Author(s): SUN Y. | TODOROVIC S.
Issue Info:
  • Year: 2010
  • Volume: 32
  • Issue: 9
  • Pages: 1610-1626
Measures:
  • Citations: 1
  • Views: 123
  • Downloads: 0
Keywords: 
Abstract: 

Issue Info:
  • Year: 2023
  • Volume: 20
  • Issue: 1
  • Pages: 39-58
Measures:
  • Citations: 0
  • Views: 91
  • Downloads: 16
Abstract:

Due to the increasing volume of data and the need for detailed analysis, clustering problems that detect the hidden patterns lying in the data remain of great importance. On the other hand, clustering high-dimensional data with traditional methods suffers from many limitations. In this study, a semi-supervised ensemble clustering method is proposed for high-dimensional medical data. In the proposed method, little prior knowledge is available, given as information on similarity or dissimilarity (a number of pairwise constraints). Initially, using the transitive property, we generalize the pairwise constraints to all data. We then divide the feature space into a number of sub-spaces; to find the optimal clustering solution, the feature space is divided randomly into sub-spaces of unequal size. A semi-supervised spectral clustering based on the p-Laplacian graph is performed on each sub-space independently. Specifically, to increase the accuracy of spectral clustering, we use a spectral clustering method based on the graph p-Laplacian, a nonlinear generalization of the graph Laplacian. The result of each clustering solution is compared with the pairwise constraints and, according to the level of agreement, a degree of confidence is assigned to each solution. Based on these degrees of confidence, an ensemble adjacency matrix is formed, which aggregates the results of all clustering solutions across the sub-spaces. This ensemble adjacency matrix is used in the final spectral clustering algorithm to find the clustering solution for the whole space. Since the sub-spaces are generated randomly with unequal numbers of features, clustering results are strongly influenced by the initial values, so it is necessary to find the optimal sub-space set. To this end, a search algorithm is designed: the search process is initialized by forming several sets (we call each set an environment), each consisting of a number of sub-spaces, and an optimal environment is one that yields the best clustering results. The search algorithm uses three search operators, which search all the environments and their sub-spaces both locally and globally; these operators combine two environments and/or replace an environment with a newly generated one, each trying to find the best possible environment in the entire search space or in a local region. We evaluate the performance of the proposed clustering scheme on 20 cancer gene datasets, using the normalized mutual information (NMI) and the adjusted Rand index (ARI) as evaluation criteria. We first examine the effect of different numbers of pairwise constraints. As expected, as the number of pairwise constraints increases, the efficiency of the proposed method also increases; for example, the NMI value increases from 0.6 to 0.9 on the Khan-2001 dataset when the number of pairwise constraints increases from 20 to 100. More pairwise constraints mean more available information, which helps to improve the performance of the clustering algorithm. Furthermore, we examine the effect of the number of random subspaces and observe that increasing it has a positive effect on clustering performance with respect to the NMI value.
In most datasets, once the number of sub-spaces reaches 20, the performance of the proposed method changes little and is stable. Examining the effect of the sampling rate for random subspace generation shows that the proposed method performs best on most cancer datasets, such as Armstrong-2002-v3 and Bredel-2005, when the rate is 0.5; as the rate deviates from 0.5, performance decreases. The results of the proposed idea are then compared with those of the method proposed in reference [21] according to ARI, and our method performs better on 12 of the 20 datasets. Finally, the proposed idea is compared with several metric-learning approaches with respect to NMI. The proposed method obtains the best results on 11 of the 20 datasets and the second-best result on 6 of the 20. For example, on the Bredel-2005 dataset the NMI obtained by the proposed method is 0.1042 higher than reference [21], 0.1846 higher than RCA, 0.4 higher than ITML, and 0.468 higher than DCA. Utilizing ensemble clustering together with the confidence factor improves the ability of the proposed algorithm to achieve better results. Likewise, applying the transitive property and selecting random subspaces of unequal sizes play an important role in achieving better performance. Using p-Laplacian spectral clustering produces better, more balanced clusters of more normal volume than standard spectral clustering. Another factor in the performance of the proposed method is the use of search operators to find the best subspace, which leads to better results.
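
As an aside, the confidence-weighted ensemble step lends itself to a short illustration. The sketch below is only a rough approximation of the idea, not the authors' implementation: it substitutes scikit-learn's standard spectral clustering for the p-Laplacian variant, omits the subspace search operators, and uses illustrative helper names (constraint_agreement, ensemble_clustering) of our own.

```python
# Minimal sketch of a confidence-weighted ensemble over random feature subspaces.
# Assumptions: standard spectral clustering stands in for the p-Laplacian variant,
# and must_link / cannot_link are index pairs supplied as prior knowledge.
import numpy as np
from sklearn.cluster import SpectralClustering

def constraint_agreement(labels, must_link, cannot_link):
    """Fraction of pairwise constraints satisfied by one clustering solution."""
    ok = sum(labels[i] == labels[j] for i, j in must_link)
    ok += sum(labels[i] != labels[j] for i, j in cannot_link)
    return ok / max(1, len(must_link) + len(cannot_link))

def ensemble_clustering(X, k, must_link, cannot_link, n_subspaces=20, rng=None):
    rng = np.random.default_rng(rng)
    n, d = X.shape
    A = np.zeros((n, n))                              # ensemble (co-association) adjacency matrix
    for _ in range(n_subspaces):
        m = int(rng.integers(low=max(2, d // 10), high=d + 1))  # unequal subspace sizes
        feats = rng.choice(d, size=m, replace=False)
        labels = SpectralClustering(n_clusters=k, affinity="nearest_neighbors",
                                    random_state=0).fit_predict(X[:, feats])
        w = constraint_agreement(labels, must_link, cannot_link)  # confidence degree
        A += w * (labels[:, None] == labels[None, :])
    A /= A.max()
    final = SpectralClustering(n_clusters=k, affinity="precomputed", random_state=0)
    return final.fit_predict(A)                       # final clustering on the ensemble matrix
```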

Issue Info:
  • Year: 2025
  • Volume: 57
  • Issue: 1
  • Pages: 80-95
Measures:
  • Citations: 0
  • Views: 3
  • Downloads: 0
Abstract:

Feature selection is a critical step in machine learning, especially when dealing with high-dimensional and incomplete data. Traditional methods often struggle with missing values, which are common in real-world applications. This paper introduces Neural Network Feature Selection (NNFS), a novel deep-learning-based approach that effectively identifies important features even in the presence of missing data. We provide a variety of comparisons to evaluate the suggested algorithm against existing methods, assessing accuracy, speed, and sensitivity to missing data. According to the numerical results, the proposed algorithm outperforms existing methods, especially for medium-sized datasets. Both simulated and real-data experiments are presented to make the results more realistic.
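
Since NNFS itself is not specified here, the following is a purely hypothetical sketch of one way a neural network can rank features under missing data: impute the values, append missingness indicators, fit a small MLP, and rank the original features by first-layer weight magnitude. It should not be read as the paper's algorithm, and the function name is ours.

```python
# Hypothetical sketch of neural-network-based feature ranking on incomplete data
# (not the paper's NNFS code). Assumes no column is entirely missing and that
# first-layer weight magnitude is an acceptable crude saliency measure.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.neural_network import MLPClassifier

def nn_feature_ranking(X, y, hidden=32, seed=0):
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X).astype(float)                  # missingness indicators
    X_imp = SimpleImputer(strategy="mean").fit_transform(X)
    Z = np.hstack([X_imp, mask])                      # imputed values + indicators
    mlp = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=2000,
                        random_state=seed).fit(Z, y)
    W = mlp.coefs_[0][: X.shape[1], :]                # weights of the value inputs only
    importance = np.abs(W).sum(axis=1)                # saliency per original feature
    return np.argsort(importance)[::-1]               # features, most important first
```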

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 239-252
Measures:
  • Citations: 0
  • Views: 120
  • Downloads: 0
Abstract:

Introduction: Clustering of high-dimensional data usually runs into problems such as the curse of dimensionality. To overcome such obstacles, dimensionality-reduction methods are often used, typically through one of two approaches: variable selection and variable extraction. Recently, researchers have proposed methods claimed to lose less information when clustering high-dimensional data than other techniques; among them, the one presented by Anderlucci et al. (2021) under the title of Random Projections is very popular. The RP method is based on creating random projections, selecting a small subset of them, and then performing the clustering task. In this article, this method is compared with conventional dimensionality-reduction approaches on three gene expression datasets using four standard clustering criteria: the adjusted Rand index, the Jaccard index, the Fowlkes-Mallows index, and the accuracy index.
Material and Methods: One variable-selection method is the variable-selection approach for clustering based on the Gaussian model; on the other hand, principal component analysis is one of the most popular methods for variable extraction. Another practical and interesting new approach to dimensionality reduction is the Random Projections method. Using a group of random projections, Anderlucci et al. (2021) proposed a clustering algorithm for high-dimensional data. This algorithm obtains the final output through Gaussian mixture model clustering applied to the optimal subset of random projections: the original high-dimensional data are mapped onto the reduced spaces, model-selection criteria are calculated for them, and observations are clustered using the optimal projections.
Results and Discussion: In this paper, the methods proposed by Anderlucci et al. (2021) are described and compared on three gene expression datasets, covering leukaemia, lymphoma, and prostate cancers. Based on the results, under the introduced criteria both competing methods attain lower values than the random projections method and therefore perform worse; the final result is that the random projections method performs better on the three datasets considered. It should be noted that the purpose of the current study was only to compare clustering performance under the three approaches and several clustering criteria, so other analytical aspects of random projection were not considered and will be explored in our future research.
Conclusion: Clustering of high-dimensional data faces various statistical challenges, and different methods exist to overcome the related problems; one practical tool is reducing the data dimension. This article examined random projection from both theoretical and practical aspects, evaluated its performance on three real data sets, compared it with other standard methods, and showed its superiority based on several conventional clustering indices. Future research could address the probabilistic aspects of the random-projections approach by considering proper statistical inference methods.
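
The random-projections pipeline sketched in the abstract can be illustrated roughly as follows. This is a simplified stand-in, not Anderlucci et al.'s implementation: it keeps only the single projection with the best BIC rather than an optimal subset, uses scikit-learn components throughout, and the function name is ours.

```python
# Rough sketch of clustering via random projections: project the data onto several
# random low-dimensional spaces, fit a Gaussian mixture in each, keep the projection
# with the best BIC, and evaluate the labels with the adjusted Rand index.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

def rp_gmm_cluster(X, k, n_projections=50, dim=10, seed=0):
    best_bic, best_labels = np.inf, None
    for i in range(n_projections):
        Z = GaussianRandomProjection(n_components=dim,
                                     random_state=seed + i).fit_transform(X)
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(Z)
        bic = gmm.bic(Z)                      # model-selection criterion per projection
        if bic < best_bic:
            best_bic, best_labels = bic, gmm.predict(Z)
    return best_labels

# Example evaluation against known labels y_true:
# print(adjusted_rand_score(y_true, rp_gmm_cluster(X, k=3)))
```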

Author(s): Najarzadeh D.
Issue Info:
  • Year: 2023
  • Volume: 17
  • Issue: 1
  • Pages: 201-218
Measures:
  • Citations: 0
  • Views: 159
  • Downloads: 76
Abstract:

In multiple regression analysis, the population multiple correlation coefficient (PMCC) is widely used to measure the correlation between one variable and a set of variables. To evaluate the existence or non-existence of this type of correlation, testing the hypothesis of zero PMCC can be very useful. For high-dimensional data, the singularity of the sample covariance matrix means that traditional testing procedures for this hypothesis lose their applicability. A simple test statistic is proposed for zero PMCC based on a plug-in estimator of the inverse of the sample covariance matrix, and a permutation test is then constructed from this statistic to test the null hypothesis. A simulation study was carried out to evaluate the performance of the proposed test on both high-dimensional and low-dimensional normal data sets. The study concludes by applying the proposed approach to data on mouse tumour volumes.
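
A generic version of a permutation test for zero PMCC can be sketched as below. Note that the statistic here is a ridge-regularized R-squared, chosen only so the quantity stays defined when the dimension exceeds the sample size; it is not the paper's plug-in estimator, and the function name is ours.

```python
# Illustrative permutation test for a zero population multiple correlation
# coefficient: compute a regularized R^2 for the observed response, then compare
# it with the same statistic under random permutations of the response.
import numpy as np
from sklearn.linear_model import Ridge

def pmcc_permutation_test(X, y, n_perm=999, alpha=1.0, seed=0):
    rng = np.random.default_rng(seed)

    def stat(y_vec):
        pred = Ridge(alpha=alpha).fit(X, y_vec).predict(X)
        resid = y_vec - pred
        return 1.0 - resid.var() / y_vec.var()        # regularized R^2

    observed = stat(y)
    perm = np.array([stat(rng.permutation(y)) for _ in range(n_perm)])
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return observed, p_value
```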

Issue Info:
  • Year: 621
  • Volume: 22
  • Issue: 2
  • Pages: 129-146
Measures:
  • Citations: 0
  • Views: 5
  • Downloads: 0
Abstract:

Graph neural networks and fuzzy models offer effective and practical methods for solving various tasks at the large-scale graph level. Large-scale graph embedding based on deep methods and fuzzy models falls into two categories: fusion and integration. Feature extraction and graph structure at the local and global levels are based on augmented graph fusion; in fusion-based graph embedding, the fuzzy model is used as an activation function within an aggregation process. In some cases, fusing graph neural network methods with fuzzy systems has been successful; however, no effective methods have been developed for integrating fuzzy models with deep methods. Two main issues are associated with this integration: (1) computational complexity, due to the exponential increase in fuzzy rules with the number of features, and (2) the complexity of the solution space, due to the combination of fuzzy regression rules between inputs and outputs. Additionally, modeling at the large-scale graph level using linear regression and graph neural networks is not sufficient. Therefore, this paper proposes a method for combining features and structure at the local and global levels using a combination of fuzzy modeling and graph transformers, an integrated deep learning technique called the Fuzzy Graph Transformer (FuzzyGT). We conducted experiments on graph datasets commonly used in deep learning to compare against the proposed model; our method achieved the best results compared with other advanced models.
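
Purely as an illustration of the integration idea, and emphatically not the FuzzyGT architecture, the snippet below shows a Gaussian fuzzy-membership layer whose per-feature rule activations could feed a downstream graph transformer block; keeping memberships per feature rather than combining them into a rule grid is one way to avoid the exponential rule growth mentioned above. The class name and parameterization are ours.

```python
# Hypothetical fuzzy-membership layer (illustration only, not the FuzzyGT model):
# each input feature is mapped to a few Gaussian rule activations.
import torch
import torch.nn as nn

class FuzzyMembership(nn.Module):
    def __init__(self, in_features, n_rules=3):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(in_features, n_rules))
        self.log_sigmas = nn.Parameter(torch.zeros(in_features, n_rules))

    def forward(self, x):                       # x: (n_nodes, in_features)
        diff = x.unsqueeze(-1) - self.centers   # (n_nodes, in_features, n_rules)
        sigma = self.log_sigmas.exp()
        mu = torch.exp(-0.5 * (diff / sigma) ** 2)   # Gaussian memberships
        return mu.flatten(1)                    # rule activations for a GNN/transformer block

# The flattened output grows linearly (in_features * n_rules) rather than
# exponentially, which is one way to sidestep the rule-explosion issue noted above.
```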

Author(s): ROUHI A. | NEZAMABADI POUR H.
Issue Info:
  • Year: 2018
  • Volume: 15
  • Issue: 4
  • Pages: 283-294
Measures:
  • Citations: 0
  • Views: 1955
  • Downloads: 0
Abstract:

Nowadays, with the advent and proliferation of high-dimensional data, the process of feature selection plays an important role in machine learning and, more specifically, in the classification task. Dealing with high-dimensional data such as microarrays is associated with problems such as an increased presence of redundant and irrelevant features, which leads to decreased classification accuracy, increased computational cost, and the curse of dimensionality. In this paper, a hybrid method using ensemble techniques for feature selection on high-dimensional data is proposed. In the first stage of the proposed method, a filter method reduces the dimensionality of the features; in the second stage, two state-of-the-art wrapper methods are run on the reduced feature subset using the ensemble technique. The proposed method is benchmarked on 8 microarray datasets, and comparison with several state-of-the-art feature selection methods confirms the effectiveness of the proposed approach.
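
A minimal two-stage sketch in the spirit of the described pipeline, using scikit-learn stand-ins rather than the authors' specific filter and wrapper methods: a univariate filter trims the features, then two recursive-feature-elimination wrappers run on the reduced set and their selections are combined by union. The function name and parameter defaults are illustrative.

```python
# Two-stage hybrid feature selection sketch: filter stage + wrapper ensemble stage.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, RFE
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

def hybrid_select(X, y, n_filter=200, n_wrapper=20):
    filt = SelectKBest(f_classif, k=min(n_filter, X.shape[1])).fit(X, y)
    keep = np.flatnonzero(filt.get_support())             # stage 1: univariate filter
    selectors = [RFE(LogisticRegression(max_iter=5000), n_features_to_select=n_wrapper),
                 RFE(LinearSVC(max_iter=5000), n_features_to_select=n_wrapper)]
    chosen = set()
    for sel in selectors:                                  # stage 2: wrapper ensemble
        sel.fit(X[:, keep], y)
        chosen.update(keep[sel.get_support()])             # combine by union
    return sorted(chosen)
```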

Issue Info:
  • Year: 2022
  • Volume: 19
  • Issue: 4
  • Pages: 302-312
Measures:
  • Citations: 0
  • Views: 239
  • Downloads: 0
Abstract:

One of the challenges of the high-dimensional outlier detection problem is the curse of dimensionality, where irrelevant dimensions (features) hide outliers. To solve this problem, dimensions containing valuable information for detecting outliers are sought, so that outliers become more prominent and detectable once the dataset is mapped into the subspace constituted by these relevant dimensions/features. This paper proposes an outlier detection method for high-dimensional data by introducing a new locally relevant subspace selection and developing local density-based outlier scoring. First, we present a locally relevant subspace selection method based on local entropy, which selects a relevant subspace for each data point according to its neighbors. Then, each data point is scored in its relevant subspace using a density-based local outlier scoring method. Our adaptive-bandwidth kernel density estimation method eliminates the slight difference between the density of a normal data point and that of its neighbors, so normal data are not wrongly detected as outliers; at the same time, our method underestimates the actual density of outlier data points to make them more prominent. Experimental results on several real datasets show that the local entropy-based subspace selection algorithm and the proposed outlier scoring achieve a high detection accuracy for outlier data.
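
A heavily simplified sketch of the general idea follows; it replaces the local-entropy criterion with the spread of each feature among a point's neighbours and uses a plain Gaussian kernel, so it should be read as an illustration rather than the paper's algorithm. The function name is ours.

```python
# Per-point subspace selection plus a density-based local outlier score.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_subspace_outlier_scores(X, k=20, subspace_size=5):
    n, d = X.shape
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                        # idx[:, 0] is the point itself
    scores = np.empty(n)
    for i in range(n):
        neigh = X[idx[i, 1:]]
        spread = neigh.std(axis=0) + 1e-12
        feats = np.argsort(spread)[:subspace_size]   # locally "relevant" features
        h = spread[feats].mean()                     # adaptive bandwidth
        dist2 = ((neigh[:, feats] - X[i, feats]) ** 2).sum(axis=1)
        density = np.exp(-dist2 / (2 * h ** 2)).mean()
        scores[i] = -np.log(density + 1e-300)        # higher score = more outlying
    return scores
```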

Author(s): 
Issue Info:
  • Year: 2023
  • Volume: 27
  • Issue: 6
  • Pages: 1896-1911
Measures:
  • Citations: 1
  • Views: 15
  • Downloads: 0
Keywords: 
Abstract: 

Issue Info:
  • Year: 2019
  • Volume: 7
  • Issue: 4 (Special Issue)
  • Pages: 626-634
Measures:
  • Citations: 0
  • Views: 201
  • Downloads: 85
Abstract:

Traditional logistic regression suffers from degenerate and unstable behavior in high-dimensional classification because of the problem of non-invertible matrices when estimating the model parameters. In this paper, to overcome the high dimensionality of the data, we introduce two new algorithms. First, we improve the efficiency of the finite-population Bayesian bootstrap logistic regression classifier by using a majority-vote rule. Second, by using simple random sampling without replacement to select a number of covariates smaller than the sample size and then applying traditional logistic regression, we introduce another new algorithm for high-dimensional binary classification. We compare the proposed algorithms with regularized logistic regression models and two other classification algorithms, i.e., naive Bayes and k-nearest neighbors, using both simulated and real data.
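
The second algorithm's covariate-subsampling idea can be sketched generically as follows (labels are assumed binary 0/1; the helper name and parameter choices are ours, not the paper's).

```python
# Sketch of covariate subsampling for high-dimensional logistic regression:
# repeatedly draw fewer covariates than the sample size by simple random sampling
# without replacement, fit ordinary logistic regression on each draw, and classify
# by majority vote over the ensemble.
import numpy as np
from sklearn.linear_model import LogisticRegression

def subsampled_logistic_vote(X_train, y_train, X_test, n_models=51, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X_train.shape
    m = min(p, n - 1)                                 # fewer covariates than samples
    votes = np.zeros((n_models, X_test.shape[0]), dtype=int)
    for b in range(n_models):
        feats = rng.choice(p, size=m, replace=False)
        clf = LogisticRegression(max_iter=5000).fit(X_train[:, feats], y_train)
        votes[b] = clf.predict(X_test[:, feats])      # assumes labels are 0/1
    return (votes.mean(axis=0) >= 0.5).astype(int)    # majority vote
```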
